Statistics is the science which deals with the collection, the analysis, the visualization and the interpretation of experimental data.
Statistics is the science which deals with the collection, the analysis, the visualization and the interpretation of experimental data.
⚠️ Definition:
Random samplings allow to characterize the properties of a finite population without measuring all of its members.
Examples
⚠️ Definition:
Observational studies are designed with the objective of identifying relationships between the different properties of a conceptual population. The role of the experimenter is to perform the selection of the sample.
Examples
⚠️ Definition:
Experiments are designed with the objective of identifying causal relations between the properties of a conceptual population. The role of the experimenter is to modify the conditions to verify the presence of causal relationship between the observed properties.
Examples
Causal relations can be assessed only in experiments
This is really Galileian ;-)
Experiments are impossible in many relevant fields like human health and ecology
Should we then give up on obtaining causal information there?
New England Journal of Medicine, 2012
” … Chocolate consumption enhances cognitive function, which is a sine qua non for winning the Nobel Prize, and it closely correlates with the number of Nobel laureates in each country …“
⚠️ Key question
What is the best way to sample my population in a representative way?
In presence of known subpopulations stratified random sampling results in a more accurate characterization of the population
The most reasonable way to “smear out” the effects of unknown biases is to do everything randomly
Random is not a synonym of HAPHAZARD
Objective: get an useful and clear answer.
Mean: start from a clear, useful and often simple question.
The smaller unit of a population which retains the properties we are interested into
A variable that influences both the dependent variable and independent variable, causing a spurious association (wikipedia).
Objective: get an useful and clear answer.
Mean: start from a clear, useful and often simple question.
⚠️ Definition
A strategy to assign the experimental units to the different treatments to optimize my capacity of inferring causal relationships
Control of unwanted sources of variability (technical/biological) to highlight the effects of the intervention
Block what you can; randomize what you cannot …
Block as much as possible!
Repeated measures are more “powerful” because each unit is the control of itself
Crossovers can be tricky for the wash-out
Repeated measures design are the key in presence of large variability in the population (e.g. plants in the field/greenhouse)